La Rioja
Multimodal Integrated Knowledge Transfer to Large Language Models through Preference Optimization with Biomedical Applications
Wu, Da, Wang, Zhanliang, Nguyen, Quan, Xu, Zhuoran, Wang, Kai
The scarcity of high-quality multimodal biomedical data limits the ability to effectively fine-tune pretrained Large Language Models (LLMs) for specialized biomedical tasks. To address this challenge, we introduce MINT (Multimodal Integrated kNowledge Transfer), a framework that aligns unimodal large decoder models with domain-specific decision patterns from multimodal biomedical data through preference optimization. While MINT supports different optimization techniques, we primarily implement it with the Odds Ratio Preference Optimization (ORPO) framework as its backbone. This strategy enables the aligned LLMs to perform predictive tasks using text-only or image-only inputs while retaining knowledge learnt from multimodal data. MINT leverages an upstream multimodal machine learning (MML) model trained on high-quality multimodal data to transfer domain-specific insights to downstream text-only or image-only LLMs. We demonstrate its effectiveness through two key applications: (1) Rare genetic disease prediction from texts, where MINT uses a multimodal encoder model, trained on facial photos and clinical notes, to generate a preference dataset for aligning a lightweight Llama 3.2-3B-Instruct. Despite relying on text input only, the MINT-derived model outperforms models trained with SFT, RAG, or DPO, and even outperforms Llama 3.1-405B-Instruct. (2) Tissue type classification using cell nucleus images, where MINT uses a vision-language foundation model as the preference generator, containing knowledge learnt from both text and histopathological images to align downstream image-only models. The resulting MINT-derived model significantly improves the performance of Llama 3.2-Vision-11B-Instruct on tissue type classification. In summary, MINT provides an effective strategy to align unimodal LLMs with high-quality multimodal expertise through preference optimization.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- Europe > Spain > La Rioja > Logroño (0.04)
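For the MINT entry above, the ORPO objective named in the abstract can be sketched numerically. This is a minimal illustration of the published ORPO formulation (a supervised loss on the preferred answer plus a weighted log-odds-ratio penalty), not the authors' training code. The function names, the weight `lam`, and the use of length-normalised average log-probabilities as inputs are assumptions made for this example.

```python
import math


def log_odds(avg_logprob: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logprob), with avg_logprob < 0."""
    p = math.exp(avg_logprob)
    return avg_logprob - math.log1p(-p)


def orpo_loss(nll_chosen: float,
              avg_logp_chosen: float,
              avg_logp_rejected: float,
              lam: float = 0.1) -> float:
    """Supervised loss on the preferred answer plus a weighted odds-ratio term.

    nll_chosen        -- negative log-likelihood of the preferred answer
    avg_logp_chosen   -- mean per-token log-probability of the preferred answer
    avg_logp_rejected -- mean per-token log-probability of the rejected answer
    lam               -- hypothetical weight of the odds-ratio term
    """
    log_odds_ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(z)) written as log(1 + exp(-z))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * odds_ratio_term
```

The penalty shrinks as the model assigns higher odds to the preferred answer than to the rejected one, so a preference pair produced by an upstream multimodal model can steer a text-only model without a separate reference model.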
CUPCase: Clinically Uncommon Patient Cases and Diagnoses Dataset
Perets, Oriel, Shoham, Ofir Ben, Grinberg, Nir, Rappoport, Nadav
Medical benchmark datasets significantly contribute to developing Large Language Models (LLMs) for medical knowledge extraction, diagnosis, summarization, and other uses. Yet, current benchmarks are mainly derived from exam questions given to medical students or cases described in the medical literature, lacking the complexity of real-world patient cases that deviate from classic textbook abstractions. These include rare diseases, uncommon presentations of common diseases, and unexpected treatment responses. Here, we construct the Clinically Uncommon Patient Cases and Diagnoses Dataset (CUPCase) based on 3,562 real-world case reports from BMC, including diagnoses in open-ended textual format and as multiple-choice options with distractors. Using this dataset, we evaluate the ability of state-of-the-art LLMs, including both general-purpose and clinical LLMs, to identify and correctly diagnose a patient case, and test models' performance when only partial information about cases is available. Our findings show that general-purpose GPT-4o attains the best performance in both the multiple-choice task (average accuracy of 87.9%) and the open-ended task (BERTScore F1 of 0.764), outperforming several LLMs with a focus on the medical domain such as Meditron-70B and MedLM-Large. Moreover, GPT-4o was able to maintain 87% and 88% of its performance with only the first 20% of tokens of the case presentation in the multiple-choice and free-text tasks, respectively, highlighting the potential of LLMs to aid in early diagnosis in real-world cases. CUPCase expands our ability to evaluate LLMs for clinical decision support in an open and reproducible manner.
- North America > United States > Massachusetts (0.04)
- Europe > Spain > La Rioja > Logroño (0.04)
- Europe > Monaco (0.04)
- Asia > Middle East > Israel (0.04)
- Health & Medicine > Diagnostic Medicine (1.00)
- Education (1.00)
- Health & Medicine > Health Care Providers & Services (0.93)
- (3 more...)
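The partial-information setting in the CUPCase entry above (keeping only the first 20% of a case presentation) can be illustrated with a short helper. This is a sketch only: whitespace splitting stands in for whatever tokenizer the authors used, and the function name `first_fraction` is hypothetical.

```python
def first_fraction(text: str, fraction: float = 0.2) -> str:
    """Keep the leading `fraction` of whitespace-separated tokens (at least one)."""
    tokens = text.split()
    if not tokens:
        return ""
    keep = max(1, int(len(tokens) * fraction))
    return " ".join(tokens[:keep])


# Hypothetical case text, used only to show the call.
case = "A 34 year old patient presented with fever rash and joint pain"
print(first_fraction(case))
```

Feeding the truncated text to a model and comparing its score with the full-text score gives the kind of retained-performance ratio the abstract reports.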
Evaluation for Regression Analyses on Evolving Data Streams
Sun, Yibin, Gomes, Heitor Murilo, Pfahringer, Bernhard, Bifet, Albert
The paper explores the challenges of regression analysis in evolving data streams, an area that remains relatively underexplored compared to classification. We propose a standardized evaluation process for regression and prediction interval tasks in streaming contexts. Additionally, we introduce an innovative drift simulation strategy capable of synthesizing various drift types, including the less-studied incremental drift. Comprehensive experiments with state-of-the-art methods, conducted under the proposed process, validate the effectiveness and robustness of our approach.
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
- Oceania > New Zealand > North Island > Wellington Region > Wellington (0.04)
- (5 more...)
- Research Report > New Finding (0.84)
- Research Report > Experimental Study (0.60)
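The abstract for the streaming regression entry above does not spell out its evaluation process. As a rough illustration, the sketch below uses test-then-train (prequential) evaluation, a common convention in stream learning that is assumed here rather than taken from the paper. The running-mean predictor, the normal-style interval with multiplier `z`, and all names are placeholders chosen for the example.

```python
def prequential_eval(stream, z: float = 1.96) -> dict:
    """Score each target before learning from it (test-then-train).

    The model is a running mean with an interval of mean +/- z * std,
    maintained with Welford's online update.
    """
    n, mean, m2 = 0, 0.0, 0.0
    abs_err, covered, scored = 0.0, 0, 0
    for y in stream:
        if n >= 2:  # need two observations for a variance estimate
            std = (m2 / (n - 1)) ** 0.5
            lower, upper = mean - z * std, mean + z * std
            abs_err += abs(y - mean)          # test on the new target ...
            covered += int(lower <= y <= upper)
            scored += 1
        n += 1                                # ... then train on it
        delta = y - mean
        mean += delta / n
        m2 += delta * (y - mean)
    if scored == 0:
        return {"mae": None, "coverage": None}
    return {"mae": abs_err / scored, "coverage": covered / scored}
```

Reporting point error and interval coverage from the same pass keeps the regression and prediction-interval tasks on identical data order, which matters when the stream drifts.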
Developing Enhanced Conversational Agents for Social Virtual Worlds
Griol, D., Sanchis, A., Molina, J. M., Callejas, Z.
In this paper, we present a methodology for the development of embodied conversational agents for social virtual worlds. The agents provide multimodal communication with their users in which speech interaction is included. Our proposal combines different techniques related to Artificial Intelligence, Natural Language Processing, Affective Computing, and User Modeling. First, a statistical methodology has been developed to model the system's conversational behavior, which is learned from an initial corpus and improved with the knowledge acquired from successive interactions. In addition, the selection of the next system response is adapted considering information stored in user profiles and the emotional content detected in the users' utterances. Our proposal has been evaluated with the successful development of an embodied conversational agent which has been placed in the Second Life social virtual world. The avatar includes the different models and interacts with the users who inhabit the virtual world in order to provide academic information. The experimental results show that the agent's conversational behavior adapts successfully to the specific characteristics of users interacting in such environments.
- Europe > Spain > Galicia > Madrid (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (12 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Generative AI in Medicine
Shanmugam, Divya, Agrawal, Monica, Movva, Rajiv, Chen, Irene Y., Ghassemi, Marzyeh, Jacobs, Maia, Pierson, Emma
Excitement about the promise of generative AI in medicine has inspired an explosion of new applications. Generative models have the potential to change how care is delivered (1-5), the roles and responsibilities of care providers (6, 7), and the communication pathways between patients and providers (8, 9). Further upstream, generative models have shown promise in improving scientific discovery in medicine (through both clinical trials (10, 11) and observational research (12, 13)) and facilitating medical education (8, 14). These developments are a direct result of technical advances in generative AI, which have drastically increased the ability to generate realistic language and images, and raise important questions about how to integrate generative models into medicine. Generative AI is the latest in a series of technical advances that have driven major shifts in medicine. Past significant advances include the adoption of electronic health records (EHRs); the integration of robotics into telesurgeries (15); and the incorporation of predictive models and continuous monitoring as foundational infrastructure for new diagnostic tools (16, 17).
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Illinois > Cook County > Evanston (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- (8 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
A Machine Learning Approach for the Efficient Estimation of Ground-Level Air Temperature in Urban Areas
Delgado-Enales, Iñigo, Lizundia-Loiola, Joshua, Molina-Costa, Patricia, Del Ser, Javier
The increasingly populated cities of the 21st century face the challenge of being sustainable and resilient spaces for their inhabitants. However, climate change, among other problems, makes these objectives difficult to achieve. The Urban Heat Island (UHI) phenomenon that occurs in cities, increasing their thermal stress, is one of the stumbling blocks to achieving a more sustainable city. The ability to estimate temperatures with a high degree of accuracy allows for the identification of the highest-priority areas in cities where urban improvements need to be made to reduce thermal discomfort. In this work we explore the usefulness of image-to-image deep neural networks (DNNs) for correlating spatial and meteorological variables of an urban area with street-level air temperature. The air temperature at street level is estimated both spatially and temporally for a specific use case, and compared with existing, well-established numerical models. Based on the obtained results, deep neural networks are confirmed to be a faster and less computationally expensive alternative to numerical models for estimating ground-level air temperature.
- South America > Brazil (0.14)
- Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- (18 more...)
- Energy (1.00)
- Construction & Engineering (0.93)
- Government (0.67)
- (2 more...)
MessIRve: A Large-Scale Spanish Information Retrieval Dataset
Valentini, Francisco, Cotik, Viviana, Furman, Damián, Bercovich, Ivan, Altszyler, Edgar, Pérez, Juan Manuel
Information retrieval (IR) is the task of finding relevant documents in response to a user query. Although Spanish is the second most spoken native language, current IR benchmarks lack Spanish data, hindering the development of information access tools for Spanish speakers. We introduce MessIRve, a large-scale Spanish IR dataset with around 730 thousand queries from Google's autocomplete API and relevant documents sourced from Wikipedia. MessIRve's queries reflect diverse Spanish-speaking regions, unlike other datasets that are translated from English or do not consider dialectal variations. The large size of the dataset allows it to cover a wide variety of topics, unlike smaller datasets. We provide a comprehensive description of the dataset, comparisons with existing datasets, and baseline evaluations of prominent IR models. Our contributions aim to advance Spanish IR research and improve information access for Spanish speakers.
- North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
- North America > Mexico (0.04)
- South America > Colombia > Bogotá D.C. > Bogotá (0.04)
- (34 more...)
Assessing and Enhancing Large Language Models in Rare Disease Question-answering
Wang, Guanchu, Ran, Junhao, Tang, Ruixiang, Chang, Chia-Yuan, Chuang, Yu-Neng, Liu, Zirui, Braverman, Vladimir, Liu, Zhandong, Hu, Xia
Despite the impressive capabilities of Large Language Models (LLMs) in general medical domains, questions remain about their performance in diagnosing rare diseases. To address these questions, we aim to assess the diagnostic performance of LLMs in rare diseases, and explore methods to enhance their effectiveness in this area. In this work, we introduce a rare disease question-answering (ReDis-QA) dataset to evaluate the performance of LLMs in diagnosing rare diseases. Specifically, we collected 1360 high-quality question-answer pairs within the ReDis-QA dataset, covering 205 rare diseases. Additionally, we annotated meta-data for each question, facilitating the extraction of subsets specific to any given disease and its property. Based on the ReDis-QA dataset, we benchmarked several open-source LLMs, revealing that diagnosing rare diseases remains a significant challenge for these models. To facilitate retrieval-augmented generation for rare disease diagnosis, we collect the first rare diseases corpus (ReCOP), sourced from the National Organization for Rare Disorders (NORD) database. Specifically, we split the report of each rare disease into multiple chunks, each representing a different property of the disease, including its overview, symptoms, causes, effects, related disorders, diagnosis, and standard therapies. This structure ensures that the information within each chunk aligns consistently with a question. Experimental results demonstrate that ReCOP can effectively improve the accuracy of LLMs on the ReDis-QA dataset by an average of 8%. Moreover, it significantly guides LLMs to generate trustworthy answers and explanations that can be traced back to existing literature.
- North America > United States > Texas (0.04)
- North America > United States > Rocky Mountains (0.04)
- North America > United States > Colorado (0.04)
- (2 more...)
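The property-wise chunking described in the ReCOP entry above can be sketched as follows. The property names come from the abstract; the dictionary layout of a report, the function name `chunk_report`, and the sample text are hypothetical and serve only to show the idea.

```python
# Property names as listed in the abstract.
PROPERTIES = (
    "overview", "symptoms", "causes", "effects",
    "related disorders", "diagnosis", "standard therapies",
)


def chunk_report(disease: str, report: dict) -> list:
    """Turn one disease report into one retrievable chunk per non-empty property."""
    chunks = []
    for prop in PROPERTIES:
        text = report.get(prop, "").strip()
        if text:
            chunks.append({"disease": disease, "property": prop, "text": text})
    return chunks


# Hypothetical report, not taken from the NORD database.
example = {"overview": "Placeholder overview.", "symptoms": "Placeholder symptoms."}
for chunk in chunk_report("Example disease", example):
    print(chunk["property"], "->", chunk["text"])
```

Tagging each chunk with its disease and property lets a retriever match a question about, say, symptoms to the chunk that covers exactly that property.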
Interpretable Differential Diagnosis with Dual-Inference Large Language Models
Zhou, Shuang, Ding, Sirui, Wang, Jiashuo, Lin, Mingquan, Melton, Genevieve B., Zhang, Rui
Automating the generation of differential diagnoses (DDx), that is, predicting a list of potential diseases as differentials from patients' symptom descriptions, is critical to clinical reasoning and to applications such as decision support. However, providing reasoning or interpretation for these differential diagnoses is more meaningful. Fortunately, large language models (LLMs) possess powerful language processing abilities and have been proven effective in various related tasks. Motivated by this potential, we investigate the use of LLMs for interpretable DDx. First, we develop a new DDx dataset with expert-derived interpretation on 570 public clinical notes. Second, we propose a novel framework, named Dual-Inf, that enables LLMs to conduct bidirectional inference for interpretation. Both human and automated evaluations demonstrate the effectiveness of Dual-Inf in predicting differentials and diagnosis explanations. Specifically, the performance improvement of Dual-Inf over the baseline methods exceeds 32% w.r.t. BERTScore in DDx interpretation. Furthermore, experiments verify that Dual-Inf (1) makes fewer errors in interpretation, (2) has great generalizability, and (3) is promising for rare disease diagnosis and explanation.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- North America > United States > California > San Francisco County > San Francisco (0.28)
- Asia > China > Hong Kong (0.04)
- Europe > Spain > La Rioja > Logroño (0.04)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.69)
- Health & Medicine > Health Care Technology > Medical Record (0.48)
- Health & Medicine > Therapeutic Area > Oncology (0.46)
DETECTA 2.0: Research into non-intrusive methodologies supported by Industry 4.0 enabling technologies for predictive and cyber-secure maintenance in SMEs
Huertas-García, Álvaro, Muñoz, Javier, Ambite, Enrique De Miguel, Camarmas, Marcos Avilés, Ovejero, José Félix
The integration of predictive maintenance and cybersecurity represents a transformative advancement for small and medium-sized enterprises (SMEs) operating within the Industry 4.0 paradigm. Despite their economic importance, SMEs often face significant challenges in adopting advanced technologies due to resource constraints and knowledge gaps. The DETECTA 2.0 project addresses these hurdles by developing an innovative system that harmonizes real-time anomaly detection, sophisticated analytics, and predictive forecasting capabilities. The system employs a semi-supervised methodology, combining unsupervised anomaly detection with supervised learning techniques. This approach enables more agile and cost-effective development of AI detection systems, significantly reducing the time required for manual case review. At the core lies a Digital Twin interface, providing intuitive real-time visualizations of machine states and detected anomalies. Leveraging cutting-edge AI engines, the system intelligently categorizes anomalies based on observed patterns, differentiating between technical errors and potential cybersecurity incidents. This discernment is fortified by detailed analytics, including certainty levels that enhance alert reliability and minimize false positives. The predictive engine uses advanced time series algorithms like N-HiTS to forecast future machine utilization trends. This proactive approach optimizes maintenance planning, enhances cybersecurity measures, and minimizes unplanned downtimes despite variable production processes. With its modular architecture enabling seamless integration across industrial setups and low implementation costs, DETECTA 2.0 presents an attractive solution for SMEs to strengthen their predictive maintenance and cybersecurity strategies.
- Europe > Spain > Galicia > Madrid (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Spain > La Rioja > Logroño (0.04)
- (2 more...)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (1.00)